regular expression

Terms from Artificial Intelligence: humans at the heart of algorithms

Page numbers are for draft copy at present; they will be replaced with correct numbers when final book is formatted. Chapter numbers are correct and will not change now.

Regular expressions are a form of pattern often used in data detectors to recognise entities in text, or during data wrangling to validate data. Regular expressions are a very simple formal grammar with limited expressivity; for example, they cannot check that arbitrarily nested brackets are balanced. There are slightly different flavours, but they typically include matching fixed text, grouping with brackets, alternatives '(acd|def)', characters in sets '[abc789]', repetitions 'a*', optional items 'a?' and wildcards '.'. For example, '(a(bc|de)f*)' matches 'abc', 'ade', 'abcf', 'adef', 'abcff', 'adeff', ... and so on with any number of 'f's at the end.
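
The example pattern above can be tried out directly. The following is a minimal sketch using Python's standard 're' module, which is only one of the flavours mentioned; the helper name 'matches' and the test strings are illustrative choices, not part of the book. It uses 'fullmatch' so that the whole string, not just a part of it, has to fit the pattern.

```python
import re

# The example pattern from the text: 'a', then 'bc' or 'de', then zero or more 'f's.
PATTERN = re.compile(r"(a(bc|de)f*)")


def matches(text: str) -> bool:
    """Return True if the whole of `text` fits the example pattern (hypothetical helper)."""
    return PATTERN.fullmatch(text) is not None


# Strings the text says should match, plus a few that should not.
for candidate in ["abc", "ade", "abcf", "adef", "abcff", "adeff", "ab", "abcde", "fabc"]:
    print(candidate, matches(candidate))

# The individual constructs listed in the text, each checked on its own.
print(re.fullmatch(r"(acd|def)", "def") is not None)  # alternatives
print(re.fullmatch(r"[abc789]", "7") is not None)     # one character from a set
print(re.fullmatch(r"a*", "aaa") is not None)         # repetition (zero or more)
print(re.fullmatch(r"a?", "") is not None)            # optional item
print(re.fullmatch(r".", "x") is not None)            # wildcard (any single character)
```

With 'search' instead of 'fullmatch', the same pattern would also be found inside longer strings such as 'fabc', which is usually what a data detector scanning running text would want; validation during data wrangling more often needs the whole-string form shown here.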

Used on Chap. 10: page 201; Chap. 14: pages 327, 328, 329, 340; Chap. 17: pages 406, 417